Search results for "Combinatorics on words"

showing 10 items of 49 documents

Block Sorting-Based Transformations on Words: Beyond the Magic BWT

2018

The Burrows-Wheeler Transform (BWT) is a word transformation introduced in 1994 for Data Compression, and later results have helped make it a fundamental tool for the design of self-indexing compressed data structures. The Alternating Burrows-Wheeler Transform (ABWT) is a more recent transformation, studied in the context of Combinatorics on Words, that works in a similar way but uses an alternating lexicographical order instead of the usual one. In this paper we study a more general class of block sorting-based transformations. The transformations in this new class prove to be interesting combinatorial tools that offer new research perspectives. In particular, we show that all the tra…

0301 basic medicine; Settore INF/01 - Informatica; Computer science; Data_CODINGANDINFORMATIONTHEORY; 0102 computer and information sciences; Block sorting; Data structure; Lexicographical order; 01 natural sciences; Upper and lower bounds; 03 medical and health sciences; Combinatorics on words; 030104 developmental biology; 010201 computation theory & mathematics; Arithmetic; Compressed Data Structures Block Sorting Combinatorics on Words Algorithms; Data compression
researchProduct
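
The two transformations named in the abstract above can be illustrated with a minimal sketch. It assumes the textbook definition of the BWT as the last column of the sorted rotations (with no end-of-string sentinel), and it assumes that the alternating order compares odd positions in the usual order and even positions in reverse. The function names `bwt` and `abwt` are illustrative and not taken from the paper.

```python
def rotations(word: str) -> list[str]:
    """All cyclic rotations of `word`."""
    return [word[i:] + word[:i] for i in range(len(word))]


def bwt(word: str) -> str:
    """Last column of the rotations sorted in the usual lexicographic order."""
    return "".join(rot[-1] for rot in sorted(rotations(word)))


def alternating_key(rot: str) -> tuple[int, ...]:
    # Assumed alternating order: 1st letter ascending, 2nd descending,
    # 3rd ascending, and so on. Negating the code point reverses the order.
    return tuple(ord(c) if i % 2 == 0 else -ord(c) for i, c in enumerate(rot))


def abwt(word: str) -> str:
    """Last column of the rotations sorted in the alternating order."""
    return "".join(rot[-1] for rot in sorted(rotations(word), key=alternating_key))


print(bwt("banana"))   # nnbaaa
print(abwt("banana"))
```

Sorting all rotations explicitly takes quadratic space, so this sketch is only meant for small examples; practical implementations rely on suffix sorting instead.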

Bacteria classification using minimal absent words

2017

Bacteria classification has been deeply investigated with different tools for many purposes, such as early diagnosis, metagenomics, and phylogenetics. Classification methods based on ribosomal DNA sequences are considered a reference in this area. We present a new classifier for bacteria species based on a dissimilarity measure of purely combinatorial nature. This measure is based on the notion of Minimal Absent Words, a combinatorial definition that recently found applications in bioinformatics. We can therefore incorporate this measure into a probabilistic neural network in order to classify bacteria species. Our approach is motivated by the fact that there is a vast literature on the com…

0301 basic medicine; supervised classification; Relation (database); Computer science; 0102 computer and information sciences; 01 natural sciences; Measure (mathematics); 03 medical and health sciences; Probabilistic neural network; combinatorics on words; probabilistic neural network; minimal absent word; lcsh:R5-920; Settore INF/01 - Informatica; business.industry; Bacterial taxonomy; Pattern recognition; bacteria classification; General Medicine; Combinatorics on words; 030104 developmental biology; 010201 computation theory & mathematics; Metagenomics; Classification methods; Artificial intelligence; business; lcsh:Medicine (General); AIMS Medical Science
researchProduct
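
For readers unfamiliar with the notion used in the abstract above, a word is a minimal absent word of a text if it does not occur in the text while all of its proper factors do. The brute-force sketch below enumerates them under that standard definition. The function names and the explicit `alphabet` parameter are assumptions made for illustration, and the dissimilarity measure and classifier of the paper are not reproduced here.

```python
def factors(text: str) -> set[str]:
    """All factors (substrings) of `text`, including the empty word."""
    n = len(text)
    return {text[i:j] for i in range(n + 1) for j in range(i, n + 1)}


def minimal_absent_words(text: str, alphabet: str) -> set[str]:
    """Brute-force enumeration, suitable for short texts only."""
    facts = factors(text)
    # A single letter is minimal absent as soon as it does not occur.
    maws = {a for a in alphabet if a not in facts}
    # A longer word a+u+b is minimal absent when a+u and u+b occur
    # in the text but a+u+b itself does not.
    for u in facts:
        for a in alphabet:
            if a + u not in facts:
                continue
            for b in alphabet:
                if u + b in facts and a + u + b not in facts:
                    maws.add(a + u + b)
    return maws


print(sorted(minimal_absent_words("abaab", "ab")))
# ['aaa', 'aaba', 'bab', 'bb']
```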

Quasi-linear time computation of the abelian periods of a word

2012

Abelian period Abelian repetition weak repetition design of algorithms text algorithms combinatorics on words
researchProduct

Computing abelian periods in words

2011

Abelian period Abelian repetition weak repetition design of algorithms text algorithms combinatorics on words; [INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS]; ComputingMilieux_MISCELLANEOUS
researchProduct

A NEW COMPLEXITY FUNCTION FOR WORDS BASED ON PERIODICITY

2013

Motivated by the extension of the critical factorization theorem to infinite words, we study the (local) periodicity function, i.e. the function that, for any position in a word, gives the size of the shortest square centered in that position. We prove that this function characterizes any binary word up to exchange of letters. We then introduce a new complexity function for words (the periodicity complexity) that, for any position in the word, gives the average value of the periodicity function up to that position. The new complexity function is independent from the other commonly used complexity measures as, for instance, the factor complexity. Indeed, whereas any infinite word with bound…

Average-case complexity; Discrete mathematics; Fibonacci number; Settore INF/01 - Informatica; General Mathematics; complexity function; Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing); Function (mathematics); periodicity; critical factorization theorem; Combinatorics; Complexity index; Combinatorics on words; Bounded function; Complexity function; Computer Science::Formal Languages and Automata Theory; Word (computer architecture); Combinatorics on word; Mathematics; International Journal of Algebra and Computation
researchProduct
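
The local periodicity function described in the abstract above can be sketched for finite words as follows. The abstract concerns infinite words, so this sketch makes two assumptions of its own: a position `pos` denotes the gap before `word[pos]`, and the "size" of a square uu is taken to be the length of u. Squares that would overhang either end of the word are not considered, and `None` is returned when no square fits. The name `local_period` is illustrative.

```python
from typing import Optional


def local_period(word: str, pos: int) -> Optional[int]:
    """Smallest p >= 1 with word[pos-p:pos] == word[pos:pos+p], else None."""
    # The square must fit entirely inside the word on both sides of `pos`.
    max_p = min(pos, len(word) - pos)
    for p in range(1, max_p + 1):
        if word[pos - p:pos] == word[pos:pos + p]:
            return p
    return None


print(local_period("abaab", 3))  # 1, from the square "aa"
print(local_period("abab", 2))   # 2, from the square "abab"
```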

Languages with mismatches

2007

In this paper we study some combinatorial properties of a class of languages that represent sets of words occurring in a text S up to some errors. More precisely, we consider sets of words that occur in a text S with k mismatches in any window of size r. The study of this class of languages focuses mainly on a parameter, called the repetition index, and on the set of the minimal forbidden words of the language of factors of S with errors. The repetition index of a string S is defined as the smallest integer such that all strings of this length occur in at most one position of the text S up to errors. We prove that there is a strong relation between the repetition index of S an…

Combinatorics on words; Approximate string matching; General Computer Science; Repetition (rhetorical device); String (computer science); Search engine indexing; Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing); Approximate string matching; Data structure; Theoretical Computer Science; Combinatorics; Set (abstract data type); Formal languages; Combinatorics on words Formal languages Approximate string matching Indexing; Indexing; Word (group theory); Mathematics; Integer (computer science); Computer Science(all); Theoretical Computer Science
researchProduct

Burrows-Wheeler transform and palindromic richness

2009

The investigation of the extremal case of the Burrows–Wheeler transform leads to the study of the words $w$ over an ordered alphabet $A=\{a_1,a_2,\ldots,a_k\}$, with $a_1<a_2<\cdots<a_k$, such that $bwt(w)$ is of the form $a_k^{n_k} a_{k-1}^{n_{k-1}} \cdots a_2^{n_2} a_1^{n_1}$, for some non-negative integers $n_1,n_2,\ldots,n_k$. A characterization of these words in the case $|A|=2$ has been given in [Sabrina Mantaci, Antonio Restivo, Marinella Sciortino, Burrows-Wheeler transform and Sturmian words, Information Processing Letters 86 (2003) 241–246], where it is proved that they correspond to the powers of conjugates of standard words. The case $|A|=3$ has been settled in [Jamie Simpson, Simon J. Puglisi, Words with simple Burrows-Wheeler transforms, Electronic …

Combinatorics on words; General Computer Science; Burrows–Wheeler transform; Settore INF/01 - Informatica; Rich words; Palindrome; Burrows-Wheeler transform; Theoretical Computer Science; Combinatorics; Rich word; Burrows-Wheeler transform; Palindromes; Rich words; Combinatorics on words; Palindrome; Palindromes; Species richness; Alphabet; Arithmetic; Burrows–Wheeler transform; Computer Science(all); Mathematics; Combinatorics on word
researchProduct

Bounded Bi-ideals and Linear Recurrence

2013

Bounded bi-ideals are a subclass of uniformly recurrent words. We introduce the notion of completely bounded bi-ideals by imposing a restriction on their generating base sequences. We prove that a bounded bi-ideal is linearly recurrent if and only if it is completely bounded.

Combinatorics; Combinatorics on words; Mathematics::Commutative Algebra; Bounded set; Bounded function; Base (topology); Bounded inverse theorem; Bounded operator; Mathematics; 2013 15th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing
researchProduct

Indexing structures for approximate string matching

2003

In this paper we give the first, to our knowledge, structures and corresponding algorithms for approximate indexing, by considering the Hamming distance, having the following properties. i) Their size is linear times a polylog of the size of the text on average. ii) For each pattern x, the time spent by our algorithms for finding the list occ(x) of all occurrences of a pattern x in the text, up to a certain distance, is proportional on average to |x| + |occ(x)|, under an additional but realistic hypothesis.

Combinatorics; Combinatorics on words; Pattern recognition (psychology); Search engine indexing; Automata theory; Hamming distance; String searching algorithm; Approximate string matching; Time complexity; Mathematics
researchProduct
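
To make the quantity occ(x) from the abstract above concrete, the following sketch lists every position where a pattern occurs in a text within a given Hamming distance. It is a naive scan of the text, not the indexing structure of the paper, and the names `hamming` and `occurrences` and the parameter `k` are assumptions made for illustration.

```python
def hamming(u: str, v: str) -> int:
    """Number of mismatching positions between two words of equal length."""
    if len(u) != len(v):
        raise ValueError("Hamming distance needs words of equal length")
    return sum(a != b for a, b in zip(u, v))


def occurrences(text: str, pattern: str, k: int) -> list[int]:
    """Start positions where `pattern` matches `text` with at most k mismatches."""
    m = len(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if hamming(text[i:i + m], pattern) <= k
    ]


print(occurrences("abracadabra", "abra", 1))  # [0, 7]
```

The scan costs time proportional to the product of the text and pattern lengths per query, which is the kind of cost an index of the sort described in the abstract is designed to avoid.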

Balanced Words Having Simple Burrows-Wheeler Transform

2009

The investigation of the "clustering effect" of the Burrows-Wheeler transform (BWT) leads to the study of words having simple BWT, i.e. words $w$ over an ordered alphabet $A=\{a_1,a_2,\ldots,a_k\}$, with $a_1 < a_2 < \ldots < a_k$, such that $bwt(w)$ is of the form $a_k^{n_k} a_{k-1}^{n_{k-1}} \cdots a_1^{n_1}$, for some non-negative integers $n_1, n_2, \ldots, n_k$. We remark that, in the case of binary alphabets, there is an equivalence between words having simple BWT, the family of (circular) balanced words and the conjugates of standard words. In the case of alphabets of size greater than two, these notions are no longer equivalent. As a main result of this paper we prove that, u…

Combinatorics; Conjugacy class; Clustering effect; Burrows–Wheeler transform; Settore INF/01 - Informatica; Burrows Wheeler Transform Combinatorics on Words Balanced sequences epistandard rich words words having simple BWT; Binary number; Burrows-Wheeler Transform; Alphabet; Binary alphabet; Burrows-Wheeler Transform; Clustering effect; Mathematics
researchProduct
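
The condition in the abstract above, that $bwt(w)$ has the form $a_k^{n_k} a_{k-1}^{n_{k-1}} \cdots a_1^{n_1}$, amounts to the letters of the transform appearing in non-increasing order. A minimal check is sketched below, assuming the BWT is computed from sorted rotations without a sentinel and that the alphabet order is the code-point order of the characters. The names `bwt` and `has_simple_bwt` are illustrative.

```python
def bwt(word: str) -> str:
    """Last column of the lexicographically sorted rotations of `word`."""
    rots = sorted(word[i:] + word[:i] for i in range(len(word)))
    return "".join(rot[-1] for rot in rots)


def has_simple_bwt(word: str) -> bool:
    """True when the letters of bwt(word) appear in non-increasing order."""
    transformed = bwt(word)
    return all(x >= y for x, y in zip(transformed, transformed[1:]))


print(has_simple_bwt("abaab"))  # True: bwt is "bbaaa"
print(has_simple_bwt("aabb"))   # False: bwt is "baba"
```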